This report was prepared with the following environment settings.
print(R.version)
## _
## platform x86_64-apple-darwin15.6.0
## arch x86_64
## os darwin15.6.0
## system x86_64, darwin15.6.0
## status
## major 3
## minor 5.1
## year 2018
## month 07
## day 02
## svn rev 74947
## language R
## version.string R version 3.5.1 (2018-07-02)
## nickname Feather Spray
First, I process the raw text data ‘lyrics.RData’, saved in the data folder, by cleaning the text and removing stopwords and blanks. This produces a tidy version of the lyrics, which is saved in the output folder.
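The cleaning step can be illustrated with a minimal base R sketch. It is not the report's actual code: the toy `lyrics` data frame, the short `stopwords` vector and the `clean_text` helper are all hypothetical stand-ins for the real lyrics data and a full stopword list.

```r
# Hypothetical toy lyrics; the real report loads lyrics.RData instead.
lyrics <- data.frame(
  id = 1:2,
  genre = c("Rock", "Pop"),
  text = c("I'll love you all the time, baby!", "Close your eyes and fall in love"),
  stringsAsFactors = FALSE
)

# Tiny illustrative stopword list (an assumption; a real list is much longer).
stopwords <- c("i", "you", "all", "the", "your", "and", "in")

# Lowercase, strip apostrophes and punctuation, collapse blanks.
clean_text <- function(x) {
  x <- tolower(x)
  x <- gsub("'", "", x)          # "i'll" becomes "ill"
  x <- gsub("[^a-z ]", " ", x)   # punctuation and digits become spaces
  x <- gsub(" +", " ", x)        # collapse repeated blanks
  trimws(x)
}

# One row per remaining word: the "tidy" layout.
tidy_lyrics <- do.call(rbind, lapply(seq_len(nrow(lyrics)), function(i) {
  words <- strsplit(clean_text(lyrics$text[i]), " ", fixed = TRUE)[[1]]
  words <- words[nzchar(words) & !(words %in% stopwords)]
  data.frame(
    id = rep(lyrics$id[i], length(words)),
    genre = rep(lyrics$genre[i], length(words)),
    word = words,
    stringsAsFactors = FALSE
  )
}))

print(tidy_lyrics)
```

Removing apostrophes before tokenizing is one way a token such as “ill” (from “I’ll”) can end up among the most frequent words.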
When we listen to music, we not only feel the beauty of the melody but also find resonance in the lyrics. The style of the lyrics often reflects the features of the type of music. So what are the characteristics of the lyrics in each genre? Do different genres have something in common in their lyrics? And how do they differ?
I explore these questions by comparing the most frequently used words and bigrams across genres, hoping to identify some topics in each genre’s lyrics.
First, I look at the overall style of all song lyrics.
Overall, the most frequently used words are “love”, “time”, “baby”, “heart” and “ill” (most likely “I’ll” with its apostrophe stripped during cleaning). The network of bigrams lets us visualize some details of the text structure: the darker the arrow, the more frequently the two words occur together. Across all song lyrics, “heart”, “love” and “baby” form the common centers of the network, and “love baby”, “fall love”, “close eyes” and “heart beat/broken” are the pairs that appear most often.
This shows that the lyrics generally lean towards the theme of love, whether sweet or sad.
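The bigram counts behind such a network can be sketched in base R as follows. This is only an illustration under assumptions: the `songs` vector is hypothetical toy data with stopwords already removed, and `make_bigrams` is a helper introduced here, not a function from the report.

```r
# Hypothetical cleaned lyrics, one string per song.
songs <- c("love baby love baby", "close eyes fall love", "heart beat fall love")

# Pair each word with the word that follows it.
make_bigrams <- function(text) {
  words <- strsplit(text, " ", fixed = TRUE)[[1]]
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}

bigrams <- unlist(lapply(songs, make_bigrams))

# Frequency of each bigram, most common first; in a network plot
# these counts would determine how dark each arrow is drawn.
bigram_counts <- sort(table(bigrams), decreasing = TRUE)
print(bigram_counts)
```

Each bigram corresponds to a directed edge from its first word to its second, so words such as “love” that take part in many frequent pairs end up as centers of the network.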
Let’s look more closely at each type of music to explore its lyric features.
For Rock music, “love”, “time”, “ill”, “baby” and “day” are the most commonly used words. In the bigram network, besides love-related pairs, we can see the iconic bigram “rock roll” in the top five. It is also worth noting that repeated pairs like “love love”, “time time”, “baby baby”, “day day”, “run run” and “dance dance” appear most often. I think that is because such repetition expresses the strong rhythm of Rock music.
Pop seems to be the most love-related genre, since “love baby” and “fall love” appear most frequently. Maybe that is why we usually hear pop music at weddings.
For Metal, “time”, “life”, “die”, “eyes” and “world” are the most common words. The most common pairs, such as “close eyes”, “live life”, “set free”, “deep inside” and “fire burn”, seem to express a sense of freedom and power, which matches the emphatic rhythm of Metal music.
For Hip-Hop, the lyrics read more like street freestyle: besides “love”, “time”, “girl” and “baby”, even dirty words like “shit”, “bitch” and “ass” are very commonly used. The bigram network also shows “love love”, “yo yo”, “baby baby”, “boom boom” and “bang bang”. These repeated words reflect the features of rap, a rhythmic and rhyming speech that is chanted.
For Country, “love” appears far more frequently than any other word. In the bigram network above, the lyrics center on “love” and “heart”, and pairs like “true love”, “fall love”, “ill love”, “sweet love” and “break heart” occur again and again. This suggests that Country songs usually deal with themes such as the sweetness of love, the pain of losing love and the hope for a warm life.
As for Jazz, I think its lyric features are very similar to those of Country music, although the two are played in totally different styles: Jazz tends to be blue, while Country music is more cheerful.
For Electronic, although “love” is still the most frequently used word, it is no longer the common center of the bigram network above. Instead, we mostly see pairs like “air somethings”, “funk soul” and “soul brother”.
The lyric features of R&B also seem to focus around “love”.
As for Indie, “love” and “home” seem to be the two main themes.
As for Folk, bigrams like “dee dee”, “dog food”, “dear goofy”, “wack fall” and “fat road” seem to form a kind of storytelling lyric.
Next, I conduct a sentiment analysis of each genre to see the emotion conveyed by its lyrics.
Here I use AFINN and bing as my lexicons. AFINN assigns each word a score between -5 and 5, with negative scores indicating negative sentiment and positive scores indicating positive sentiment; I use it to calculate the total sentiment score of each genre. The bing lexicon categorizes words in a binary fashion as positive or negative; I use it to label each word in the lyrics.
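The two uses of the lexicons can be sketched in base R as below. This is a minimal illustration under assumptions, not the report's code: `tidy_lyrics` is hypothetical toy data, and `afinn_like` and `bing_like` are small hand-written stand-ins whose values are illustrative. With the tidytext package, the real lexicons would typically be obtained with `get_sentiments("afinn")` and `get_sentiments("bing")`.

```r
# Hypothetical tidy lyrics: one row per word.
tidy_lyrics <- data.frame(
  genre = c("Metal", "Metal", "Metal", "Jazz", "Jazz", "Jazz"),
  word  = c("die", "fire", "free", "love", "smile", "cry"),
  stringsAsFactors = FALSE
)

# Toy stand-ins for the lexicons (illustrative entries only).
afinn_like <- data.frame(
  word  = c("die", "free", "love", "smile", "cry"),
  value = c(-3, 1, 3, 2, -1),
  stringsAsFactors = FALSE
)
bing_like <- data.frame(
  word      = c("die", "free", "love", "smile", "cry"),
  sentiment = c("negative", "positive", "positive", "positive", "negative"),
  stringsAsFactors = FALSE
)

# AFINN-style use: join on word, then sum the scores per genre.
# merge() is an inner join, so words missing from the lexicon ("fire") drop out.
scored <- merge(tidy_lyrics, afinn_like, by = "word")
genre_score <- aggregate(value ~ genre, data = scored, FUN = sum)
print(genre_score)

# bing-style use: label each word, then count the labels per genre.
labelled <- merge(tidy_lyrics, bing_like, by = "word")
label_counts <- table(labelled$genre, labelled$sentiment)
print(label_counts)
```

Because only words found in the lexicon contribute, a genre's total score depends on both how many of its words are covered and how strongly they are scored.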
In the word cloud above, the size of each word is proportional to its frequency within its sentiment. The most prominent positive words are “love”, “smile”, “free”, “shine” and “sweet”, while the most common negative words include “die”, “lie”, “fall”, “break” and “cry”.
According to the ranking of sentiment scores above, the lyrics of Rock, Hip-Hop and Metal convey the most negative emotion, while the lyrics of Jazz, Country and Folk tend to be much more positive.
We can also see from the histograms that the sentiment score distributions of Rock, Hip-Hop and Metal are significantly skewed to the right, while the sentiment of Jazz and Country is slightly positive.